

Avaliação de eficiência na leitura: uma abordagem baseada em PLN (Reading Efficiency Assessment: an NLP-Based Approach)

de Gois, Túlio Sousa, Freitag, Raquel Meister Ko.

arXiv.org Artificial Intelligence

The cloze test, widely used due to its low cost and flexibility, makes it possible to assess reading comprehension by filling in gaps in texts, requiring the mobilization of diverse linguistic repertoires. However, traditional correction methods, based only on exact answers, limit the identification of nuances in student performance. This study proposes an automated evaluation model for the cloze test in Brazilian Portuguese, integrating orthographic (edit distance), grammatical (POS tagging) and semantic (similarity between embeddings) analyses. The integrated method demonstrated its effectiveness, achieving a high correlation with human evaluation (0.832). The results indicate that the automated approach is robust, sensitive to variations in linguistic repertoire and suitable for educational contexts that require scalability.
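The abstract describes combining three signals into one score. A minimal sketch of that composition is below; the toy lexicon, two-dimensional vectors, POS tags, and equal weights are illustrative assumptions, not the authors' actual model or resources.

```python
# Hedged sketch: scoring one cloze answer by combining orthographic,
# grammatical, and semantic evidence. Lexicon, vectors, and weights
# are made-up stand-ins for a real POS tagger and embedding model.
import math

def edit_distance(a, b):
    # classic Levenshtein distance via dynamic programming
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

# toy resources (stand-ins for a real tagger and embedding model)
POS = {"casa": "NOUN", "lar": "NOUN", "correr": "VERB"}
VEC = {"casa": [0.9, 0.1], "lar": [0.8, 0.2], "correr": [0.1, 0.9]}

def cloze_score(answer, target, w_orth=1/3, w_pos=1/3, w_sem=1/3):
    orth = 1 - edit_distance(answer, target) / max(len(answer), len(target))
    pos = 1.0 if POS.get(answer) == POS.get(target) else 0.0
    sem = cosine(VEC[answer], VEC[target]) if answer in VEC and target in VEC else 0.0
    return w_orth * orth + w_pos * pos + w_sem * sem

print(round(cloze_score("casa", "casa"), 3))  # exact answer scores 1.0
print(round(cloze_score("lar", "casa"), 3))   # synonym: partial orthographic, full POS + semantic credit
```

The point of the composite is visible in the second call: a synonym that an exact-match correction would mark wrong still earns substantial credit from the grammatical and semantic components.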


Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation

Bezerra, Lucas C. D., Santos, Ataíde M. G. dos, Park, Shinkyu

arXiv.org Artificial Intelligence

We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA). Our approach extends Multi-Agent Proximal Policy Optimization (MAPPO) by incorporating spatial action maps, robot motion control, task allocation revision, and intention sharing to enable effective coalition formation. Extensive simulations demonstrate that our model significantly outperforms existing methods, including a market-based baseline. Furthermore, we assess the scalability and generalizability of the proposed framework, highlighting its ability to handle large robot populations and adapt to diverse task allocation environments.


NLP and Education: using semantic similarity to evaluate filled gaps in a large-scale Cloze test in the classroom

de Gois, Túlio Sousa, Freitas, Flávia Oliveira, Tejada, Julian, Freitag, Raquel Meister Ko.

arXiv.org Artificial Intelligence

Since the middle of the last century, the Cloze test has been used for educational purposes to assess proficiency in understanding texts in different languages Taylor [1953], Brown [1980, 2002]. The task consists of systematically filling in gaps in a text, specifically a prose selection Bickley et al. [1970], previously adapted to the participants' realities, and the number of correct answers is associated with the participant's degree of text comprehension. Different measures, such as exact answer, acceptable answer Brown [1980], multiple choice, and Clozentropy Darnell [1968], Lowry and Marr [1975], have been used to assess gap-filling since Taylor's initial proposal Taylor [1953]. These measures are further examined in Section 2. The exact-answer measure may seem easier to compute, especially for a Cloze test applied to large and heterogeneous groups of students when teachers lack the time to analyze each answer individually. In Brazil, for instance, teachers usually have to manage numerous classes, and this correction method provides rapid feedback on students' reading proficiency, allowing answers to be checked objectively Cunha and Santos [2010], albeit without accounting for acceptable alternative answers.
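The scoring schemes named in the abstract can be contrasted in a few lines. The answer lists, the answer key, and the log-probability formulation of Clozentropy below are illustrative assumptions, not data or formulas from the study.

```python
# Hedged sketch contrasting three cloze scoring schemes: exact answer,
# acceptable answer, and a simple Clozentropy-style credit. The reference
# group and acceptable set are made up for illustration.
import math
from collections import Counter

reference_answers = ["house", "house", "home", "house", "building"]  # peer fills for one gap
acceptable = {"house", "home"}

def exact_score(ans, key="house"):
    return 1.0 if ans == key else 0.0

def acceptable_score(ans):
    return 1.0 if ans in acceptable else 0.0

def clozentropy_score(ans, reference):
    # one simple formulation: log-probability of the answer within the
    # reference group (closer to zero = more typical of the group)
    counts = Counter(reference)
    p = counts.get(ans, 0) / len(reference)
    return math.log2(p) if p > 0 else float("-inf")

print(exact_score("home"), acceptable_score("home"))
print(round(clozentropy_score("house", reference_answers), 3))
```

The contrast is the motivation for Section 2: "home" fails the exact measure yet passes the acceptable one, while the Clozentropy-style credit grades answers continuously by how typical they are of a criterion group.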


ChatGPT as Co-Advisor in Scientific Initiation: Action Research with Project-Based Learning in Elementary Education

Villan, Fabiano, Santos, Renato P. dos

arXiv.org Artificial Intelligence

Background: In the contemporary educational landscape, technology has the power to drive innovative pedagogical practices. Overcoming the resistance of teachers and students to adopting new methods and technologies is a challenge that needs to be addressed. Objectives: To evaluate the effectiveness of ChatGPT as a co-advisor in research projects and its influence on the implementation of Project-Based Learning (PBL), as well as overcoming resistance to the use of new pedagogical methodologies. Design: An action-research methodology was employed, including unstructured interviews and the application of questionnaires via Google Forms. Setting and Participants: The research was conducted in an elementary school, involving 353 students and 16 teachers. Data Collection and Analysis: Data were gathered through observations and notes in meetings and interviews, complemented by electronic questionnaires, with quantitative and qualitative analyses performed via Microsoft Excel and Google Forms. Results: The introduction of ChatGPT as a pedagogical tool led to increased student engagement and decreased teacher resistance, reflected in recognition at local science fairs. Conclusion: The study confirmed the utility of ChatGPT in school research co-orientation, highlighting its role in facilitating PBL and promoting cultural changes in educational practice, with proactive school management identified as a catalysing element in adapting to educational innovations.


Image Classification using Combination of Topological Features and Neural Networks

Lima, Mariana Dória Prata, Giraldi, Gilson Antonio, Junior, Gastão Florêncio Miranda

arXiv.org Artificial Intelligence

In this work we use the persistent homology method, a technique from topological data analysis (TDA), to extract essential topological features from the data space and combine them with deep learning features for classification tasks. In TDA, the concepts of complexes and filtration are building blocks. First, a filtration is constructed from a complex. Then, persistent homology classes are computed, and their evolution along the filtration is visualized through the persistence diagram. Additionally, we apply vectorization techniques to the persistence diagram to make this topological information compatible with machine learning algorithms. This is carried out with the aim of classifying images from multiple classes in the MNIST dataset. Our approach inserts topological features into deep learning pipelines composed of single- and two-stream neural network architectures, based on a multi-layer perceptron (MLP) and a convolutional neural network (CNN), tailored for multi-class classification on the MNIST dataset. In our analysis, we evaluate the obtained results and compare them with the outcomes achieved by the baselines available in the TensorFlow library. The main conclusion is that topological information may increase neural network accuracy in multi-class classification tasks, at the price of the computational complexity of the persistent homology calculation. To the best of our knowledge, this is the first work that combines deep learning features with topological features for multi-class classification tasks.
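The vectorization step the abstract mentions can be illustrated with a minimal scheme: turning a persistence diagram (a set of (birth, death) pairs) into a fixed-length vector a network can consume. The binning scheme below is one simple choice among many (persistence images, landscapes, etc.) and is not necessarily the one used in the paper.

```python
# Hedged sketch: vectorize a persistence diagram by accumulating the
# lifetime (death - birth) of each feature into bins indexed by birth time.
def vectorize_diagram(diagram, n_bins=4, max_scale=1.0):
    vec = [0.0] * n_bins
    for birth, death in diagram:
        idx = min(int(birth / max_scale * n_bins), n_bins - 1)
        vec[idx] += death - birth  # accumulate persistence
    return vec

# toy diagram: three short-lived features and one persistent one
diagram = [(0.0, 0.1), (0.05, 0.15), (0.1, 0.9), (0.6, 0.7)]
features = vectorize_diagram(diagram)
print(features)  # fixed-length vector, ready to concatenate with MLP/CNN features
```

Because the output length is fixed regardless of how many points the diagram contains, such vectors can be concatenated with the learned features of either stream of the network.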


Learning to Paraphrase Sentences to Different Complexity Levels

Chi, Alison, Chen, Li-Kuang, Chang, Yi-Chen, Lee, Shu-Hui, Chang, Jason S.

arXiv.org Artificial Intelligence

While sentence simplification is an active research topic in NLP, its adjacent tasks of sentence complexification and same-level paraphrasing are not. To train models on all three tasks, we present two new unsupervised datasets. We compare these datasets, one labeled by a weak classifier and the other by a rule-based approach, with a single supervised dataset. Using these three datasets for training, we perform extensive experiments on both multitasking and prompting strategies. Compared to other systems trained on unsupervised parallel data, models trained on our weak classifier labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark. Our models also outperform previous work on sentence level targeting. Finally, we establish how a handful of Large Language Models perform on these tasks under a zero-shot setting.


Datasets for Portuguese Legal Semantic Textual Similarity: Comparing weak supervision and an annotation process approaches

Junior, Daniel da Silva, Corval, Paulo Roberto dos S., Paes, Aline, de Oliveira, Daniel

arXiv.org Artificial Intelligence

The Brazilian judiciary has a large workload, resulting in long times to conclude legal proceedings. The Brazilian National Council of Justice established, in Resolution 469/2022, formal guidance for document and process digitalization, opening up the possibility of using automatic techniques to help with everyday tasks in the legal field, particularly with the large volume of texts produced in routine law procedures. Notably, Artificial Intelligence (AI) techniques allow for processing and extracting useful information from textual data, potentially speeding up the process. However, the legal-domain datasets required by several AI techniques are scarce and difficult to obtain, as they need labels from experts. To address this challenge, this article contributes four datasets from the legal domain: two with documents and metadata but unlabeled, and two labeled with a heuristic aimed at textual semantic similarity tasks. Also, to evaluate the effectiveness of the proposed heuristic labeling process, this article presents a small ground-truth dataset generated from domain-expert annotations. The analysis of ground-truth labels highlights that semantic analysis of domain text can be challenging even for domain experts. Moreover, the comparison between ground-truth and heuristic labels shows that heuristic labels are useful.
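The heuristic-labeling idea can be sketched generically: pairs of documents get a weak "similar"/"dissimilar" label from a cheap similarity score plus a threshold. The bag-of-words cosine and the 0.5 threshold below are illustrative assumptions; the paper's heuristic operates on Brazilian legal documents with its own criteria.

```python
# Hedged sketch of weak-supervision labeling for semantic textual
# similarity: a cheap similarity score thresholded into a binary label.
import math
from collections import Counter

def bow_cosine(text_a, text_b):
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def heuristic_label(text_a, text_b, threshold=0.5):
    # weak label: 1 = similar, 0 = dissimilar
    return 1 if bow_cosine(text_a, text_b) >= threshold else 0

pair = ("recurso especial em processo civil", "recurso especial no processo penal")
print(heuristic_label(*pair))
```

The ground-truth comparison in the article plays the role of checking how often such cheap labels agree with expert judgment, which is exactly where surface-level similarity can mislead on legal text.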


Context-aware controller inference for stabilizing dynamical systems from scarce data

Werner, Steffen W. R., Peherstorfer, Benjamin

arXiv.org Artificial Intelligence

This work introduces a data-driven control approach for stabilizing high-dimensional dynamical systems from scarce data. The proposed context-aware controller inference approach is based on the observation that controllers need to act locally only on the unstable dynamics to stabilize systems. This means it is sufficient to learn the unstable dynamics alone, which are typically confined to much lower-dimensional spaces than the high-dimensional state spaces of the full system dynamics, and thus few data samples suffice to identify them. Numerical experiments demonstrate that context-aware controller inference learns stabilizing controllers from orders of magnitude fewer data samples than traditional data-driven control techniques and variants of reinforcement learning. The experiments further show that the low data requirements of context-aware controller inference are especially beneficial in data-scarce engineering problems with complex physics, for which learning the complete system dynamics is often intractable in terms of data and training costs.
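The central observation, that the unstable dynamics are low-dimensional and only they need feedback, can be illustrated on a toy linear system. The matrix, the feedback gains, and the choice of acting directly on the unstable coordinates are assumptions for illustration, not the paper's inference method.

```python
# Hedged illustration: in x' = A x with mostly stable modes, only the
# few eigenvalues with positive real part need feedback to stabilize.
import numpy as np

n = 50
# toy system: stable diagonal bulk plus 2 unstable modes (0.5 and 1.2)
A = np.diag(np.concatenate([-1.0 - 0.1 * np.arange(n - 2), [0.5, 1.2]]))

eigvals = np.linalg.eigvals(A)
n_unstable = int(np.sum(eigvals.real > 0))
print(n_unstable, "unstable modes out of", n)  # → 2 unstable modes out of 50

# feedback acting only on the 2 unstable coordinates
B = np.zeros((n, 2)); B[n - 2, 0] = B[n - 1, 1] = 1.0
K = np.zeros((2, n)); K[0, n - 2] = 1.5; K[1, n - 1] = 2.2  # shift 0.5 -> -1.0, 1.2 -> -1.0
closed = A - B @ K
print(bool(np.all(np.linalg.eigvals(closed).real < 0)))  # → True
```

A rank-2 feedback stabilizes the 50-dimensional system, which is the intuition behind why so few data samples suffice: only the 2-dimensional unstable subspace has to be identified from data.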


Machine Learning Simulates Agent-Based Model Towards Policy

Furtado, Bernardo Alves, Andreão, Gustavo Onofre

arXiv.org Artificial Intelligence

Public policies are not intrinsically positive or negative. Rather, policies produce varying levels of effects across different recipients. Methodologically, computational modeling enables the application of multiple influences on empirical data, thus allowing for heterogeneous responses to policies. We use a random forest machine learning algorithm to emulate an agent-based model (ABM) and evaluate competing policies across 46 Metropolitan Regions (MRs) in Brazil. In doing so, we use input parameters and output indicators of 11,076 actual simulation runs and one million emulated runs. As a result, we obtain the optimal (and non-optimal) performance of each region over the policies. The optimum is defined as a combination of GDP production and the Gini coefficient inequality indicator for the full ensemble of Metropolitan Regions. Results suggest that MRs already have embedded structures that favor optimal or non-optimal results, but they also illustrate which policy is more beneficial to each place. In addition to providing MR-specific policy results, the use of machine learning to emulate an ABM reduces the computational burden while allowing for much larger variation among model parameters. The coherence of results under larger uncertainty, vis-à-vis those of the original ABM, reinforces the robustness of the model. At the same time, the exercise indicates which parameters policymakers should intervene on in order to design precise, optimal policy instruments.
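The emulation workflow can be sketched end-to-end: fit a fast surrogate on a modest number of (policy parameters → outcome indicator) simulation runs, then query it on far more parameter combinations than the ABM itself could afford. The synthetic "ABM" function, parameter names, and run counts below are stand-ins, not the paper's model or indicators.

```python
# Hedged sketch: random forest as a cheap emulator of an expensive
# simulation model, then used to search the policy-parameter space.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

def toy_abm(params):
    # stand-in outcome: "GDP minus inequality penalty" as a smooth function
    tax, transfer = params[:, 0], params[:, 1]
    return np.sin(3 * tax) + 0.5 * transfer - transfer ** 2

X_train = rng.uniform(0, 1, size=(500, 2))     # the "actual simulation runs"
y_train = toy_abm(X_train)

emulator = RandomForestRegressor(n_estimators=100, random_state=0)
emulator.fit(X_train, y_train)

X_query = rng.uniform(0, 1, size=(50_000, 2))  # far cheaper than re-running the ABM
y_hat = emulator.predict(X_query)
best = X_query[np.argmax(y_hat)]
print("emulated optimum near tax=%.2f, transfer=%.2f" % (best[0], best[1]))
```

Each emulator query costs microseconds versus a full ABM run, which is what makes one million emulated runs tractable and lets the search cover much larger parameter variation.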


On the Information Content of Predictions in Word Analogy Tests

Montalvão, Jugurta

arXiv.org Artificial Intelligence

An approach is proposed to quantify, in bits of information, the actual relevance of analogies in analogy tests. The main component of this approach is a soft-accuracy estimator that also yields entropy estimates with compensated biases. Experimental results obtained with pre-trained GloVe 300-D vectors and two public analogy test sets show that, from an information content perspective, proximity hints are much more relevant than analogies in analogy tests. Accordingly, a simple word embedding model is used to predict that analogies carry about one bit of information, which is experimentally corroborated.
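The "bits of information" framing can be made concrete with a simple rank-based account: if the correct word lands at rank r among V candidates, the prediction narrowed the search by log2(V / r) bits. The vocabulary size and ranks below are invented for illustration and this is not the paper's soft-accuracy estimator.

```python
# Hedged sketch: converting prediction quality in an analogy test into
# bits of information via the rank of the correct answer.
import math

V = 400_000                        # assumed vocabulary size
ranks = [1, 2, 1, 5, 3, 1, 10, 1]  # assumed ranks of correct answers

bits = [math.log2(V / r) for r in ranks]
mean_bits = sum(bits) / len(bits)
print(round(mean_bits, 2), "bits per prediction on average")
```

Under this accounting, the paper's finding corresponds to a decomposition of such totals: most of the bits come from proximity in the embedding space, with the analogy relation itself contributing only about one additional bit.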